MNIST data with noise

Pixel values take $256$ discrete values between $0$ and $255$, and every value is represented.

Evaluation comments

For the metrics, the output $y \in [-1,1]$ was transformed to $\hat{y} \in [0,1]$ and then rounded to two decimal places, since we want to observe the difference between runs rather than the absolute value.
For the classifier, the data was not converted.
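
The notes do not state which transformation was used. A minimal sketch, assuming the linear map $\hat{y} = (y + 1) / 2$ and a hypothetical helper named `to_unit_interval`:

```python
import numpy as np

def to_unit_interval(y, decimals=2):
    """Map values from [-1, 1] to [0, 1], then round.

    Hypothetical helper: the linear map (y + 1) / 2 is an assumption,
    the source only gives the input and output ranges.
    """
    y = np.asarray(y, dtype=np.float64)
    return np.round((y + 1.0) / 2.0, decimals)

y = np.array([-1.0, -0.5, 0.0, 0.337, 1.0])
y_hat = to_unit_interval(y)
print(y_hat)
```

Rounding to two decimals leaves at most $101$ distinct values in $[0,1]$, which is coarser than the $256$ original intensity levels.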


$\epsilon$-accuracy


diff <- |target_set - predicted_set|        // pixelwise absolute difference, stored as (10000, 784)
accu <- 0                                   // accumulator
loop elem in diff                           // for each row of diff, i.e. one image of shape (784,)
    accu <- accu + |{i ∈ elem : i < ε}| / 784   // fraction of pixels in this image with error < ε

accu <- accu / 10000                        // average over the examples in the image set
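
The pseudocode above can be sketched in vectorized NumPy as follows. The function name `epsilon_accuracy` and the toy arrays are illustrative assumptions, with two images of four pixels standing in for the $(10000, 784)$ sets.

```python
import numpy as np

def epsilon_accuracy(target_set, predicted_set, eps):
    """Mean fraction of pixels whose absolute error is below eps."""
    diff = np.abs(target_set - predicted_set)   # shape (m, n)
    per_image = (diff < eps).mean(axis=1)       # fraction of pixels per image
    return per_image.mean()                     # average over images

# Toy data: 2 images with 4 pixels each (values chosen to be exact in binary).
target = np.array([[0.0, 0.5, 1.0, 0.25],
                   [0.0, 0.25, 1.0, 0.5]])
predicted = np.array([[0.0, 0.375, 0.5, 0.25],
                      [0.0, 0.75, 1.0, 0.5]])

acc = epsilon_accuracy(target, predicted, eps=0.25)
print(acc)  # 0.75: three of four pixels are within eps in each image
```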

$\epsilon$-outliers


diff <- |target_set - predicted_set|        // pixelwise absolute difference, stored as (10000, 784)
accu <- 0                                   // accumulator
loop elem in diff                           // for each row of diff, i.e. one image of shape (784,)
    accu <- accu + |{i ∈ elem : i > ε}| / 784   // fraction of pixels in this image with error > ε

accu <- accu / 10000                        // average over the examples in the image set
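
A corresponding NumPy sketch for the outlier fraction, again with an assumed function name (`epsilon_outliers`) and toy data. Because both comparisons are strict, pixels whose error equals $\epsilon$ exactly are counted by neither metric, so the two values sum to $1$ only when no error is exactly $\epsilon$.

```python
import numpy as np

def epsilon_outliers(target_set, predicted_set, eps):
    """Mean fraction of pixels whose absolute error exceeds eps."""
    diff = np.abs(target_set - predicted_set)   # shape (m, n)
    per_image = (diff > eps).mean(axis=1)       # fraction of pixels per image
    return per_image.mean()                     # average over images

# Toy data: 2 images with 4 pixels each (values chosen to be exact in binary).
target = np.array([[0.0, 0.5, 1.0, 0.25],
                   [0.0, 0.25, 1.0, 0.5]])
predicted = np.array([[0.0, 0.375, 0.5, 0.25],
                      [0.0, 0.75, 1.0, 0.5]])

out = epsilon_outliers(target, predicted, eps=0.25)
print(out)  # 0.25: one of four pixels exceeds eps in each image

# With eps equal to an error that actually occurs (0.5), that pixel is
# neither "< eps" nor "> eps", so it is not an outlier here.
out_at_boundary = epsilon_outliers(target, predicted, eps=0.5)
print(out_at_boundary)  # 0.0
```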

mse: mean squared error

$\operatorname{mse} =\frac{1}{m} \sum_{j=1}^{m} \frac{1}{n} \lVert Y_j-\hat{Y}_j \rVert_2^2 = \frac{1}{m} \sum_{j=1}^{m} \frac{1}{n} \sum_{i=1}^{n} (y_{j i}-\hat{y}_{j i})^2$
where:

  • $m$: number of examples, i.e. $10000$
  • $\mathbb{Y}=\{Y_1, Y_2, \cdots, Y_m\}$: target image set, i.e. a set of $m$ images, each represented by a vector of length $n$
  • $\mathbb{\hat{Y}}=\{\hat{Y}_1, \hat{Y}_2, \cdots, \hat{Y}_m\}$: predicted image set, i.e. a set of $m$ images, each represented by a vector of length $n$
  • $n$: number of pixels, i.e. $784$
  • $Y_j=\{y_{j 1},y_{j 2}, \cdots, y_{j n}\}$: target data of image $j$, i.e. a $(784, 1)$ vector
  • $\hat{Y}_j=\{\hat{y}_{j 1},\hat{y}_{j 2}, \cdots, \hat{y}_{j n}\}$: prediction for image $j$, i.e. a $(784, 1)$ vector
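
As a small sketch of the formula, the inner mean runs over the pixels of one image and the outer mean over the images. The function name `mse` and the toy arrays are assumptions for illustration.

```python
import numpy as np

def mse(target_set, predicted_set):
    """Mean over images of the per-image mean squared pixel error."""
    sq = (target_set - predicted_set) ** 2   # shape (m, n)
    per_image = sq.mean(axis=1)              # (1/n) * sum over pixels
    return per_image.mean()                  # (1/m) * sum over images

# Toy data: m = 2 images with n = 2 pixels each.
target = np.array([[0.0, 1.0],
                   [1.0, 1.0]])
predicted = np.array([[0.0, 0.0],
                      [0.5, 1.0]])

mse_value = mse(target, predicted)
print(mse_value)  # 0.3125 = ((0 + 1) / 2 + (0.25 + 0) / 2) / 2
```

Since every image has the same number of pixels, this equals the plain mean over all $m \cdot n$ squared errors.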

mae: mean absolute error

$\operatorname{mae} =\frac{1}{m} \sum_{j=1}^{m} \frac{1}{n} \lVert Y_j-\hat{Y}_j \rVert_1 = \frac{1}{m} \sum_{j=1}^{m} \frac{1}{n} \sum_{i=1}^{n} |y_{j i}-\hat{y}_{j i}|$
where:

  • $m$: number of examples, i.e. $10000$
  • $\mathbb{Y}=\{Y_1, Y_2, \cdots, Y_m\}$: target image set, i.e. a set of $m$ images, each represented by a vector of length $n$
  • $\mathbb{\hat{Y}}=\{\hat{Y}_1, \hat{Y}_2, \cdots, \hat{Y}_m\}$: predicted image set, i.e. a set of $m$ images, each represented by a vector of length $n$
  • $n$: number of pixels, i.e. $784$
  • $Y_j=\{y_{j 1},y_{j 2}, \cdots, y_{j n}\}$: target data of image $j$, i.e. a $(784, 1)$ vector
  • $\hat{Y}_j=\{\hat{y}_{j 1},\hat{y}_{j 2}, \cdots, \hat{y}_{j n}\}$: prediction for image $j$, i.e. a $(784, 1)$ vector
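
The same structure applies with absolute values in place of squares. A minimal sketch, with the function name `mae` and the toy arrays assumed for illustration:

```python
import numpy as np

def mae(target_set, predicted_set):
    """Mean over images of the per-image mean absolute pixel error."""
    abs_err = np.abs(target_set - predicted_set)   # shape (m, n)
    per_image = abs_err.mean(axis=1)               # (1/n) * sum over pixels
    return per_image.mean()                        # (1/m) * sum over images

# Toy data: m = 2 images with n = 2 pixels each.
target = np.array([[0.0, 1.0],
                   [1.0, 1.0]])
predicted = np.array([[0.0, 0.0],
                      [0.5, 1.0]])

mae_value = mae(target, predicted)
print(mae_value)  # 0.375 = ((0 + 1) / 2 + (0.5 + 0) / 2) / 2
```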

Average maximal difference

The maximal pixelwise difference between target and prediction is taken within each example, then averaged over all examples.

Maximal difference

The maximal pixelwise difference between target and prediction is taken within each example, then the maximum over all examples is reported.
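
Both the average maximal difference and the maximal difference reduce the same per-example maxima, once with a mean and once with a maximum. A sketch under the assumption that "difference" means the absolute pixelwise difference between target and prediction. The function name `max_difference_stats` is hypothetical.

```python
import numpy as np

def max_difference_stats(target_set, predicted_set):
    """Return (average maximal difference, maximal difference)."""
    diff = np.abs(target_set - predicted_set)   # shape (m, n)
    per_image_max = diff.max(axis=1)            # largest pixel error per image
    return per_image_max.mean(), per_image_max.max()

# Toy data: 2 images with 3 pixels each (values chosen to be exact in binary).
target = np.array([[0.0, 0.25, 1.0],
                   [0.5, 0.5, 0.5]])
predicted = np.array([[0.0, 0.75, 1.0],
                      [0.5, 0.25, 0.5]])

avg_max, overall_max = max_difference_stats(target, predicted)
print(avg_max, overall_max)  # 0.375 0.5: per-image maxima are 0.5 and 0.25
```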

Error distribution


$\epsilon$-accuracy


diff <- |target_set - predicted_set|        // pixelwise absolute difference, stored as (10000, 784)
diff <- flatten diff                        // becomes a vector of 10000 * 784 error values
diff <- drop all elem < ε                   // all elements smaller than ε are taken out of the list
create histogram(diff, bin_size)
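
The steps above might look as follows with `np.histogram`. This is a sketch under two assumptions not confirmed by the notes: that `bin_size` is the number of bins, and that errors lie in $[0, 1]$ so the histogram range can be fixed. The function name `error_histogram` and the random stand-in data are hypothetical.

```python
import numpy as np

def error_histogram(target_set, predicted_set, eps, bins=32):
    """Histogram of absolute pixel errors that are at least eps.

    Assumption: `bins` is the number of equal-width bins on [0, 1].
    """
    diff = np.abs(target_set - predicted_set).ravel()   # flatten to (m * n,)
    kept = diff[diff >= eps]                            # drop errors below eps
    counts, edges = np.histogram(kept, bins=bins, range=(0.0, 1.0))
    return counts, edges

# Random stand-in data in [0, 1]; 100 images instead of 10000 to keep it small.
rng = np.random.default_rng(0)
target = rng.random((100, 784))
predicted = rng.random((100, 784))

eps = 1 / 256
counts, edges = error_histogram(target, predicted, eps)
n_kept = int((np.abs(target - predicted) >= eps).sum())
print(len(counts), len(edges), counts.sum() == n_kept)
```

Fixing the range keeps the bin edges identical across values of $\epsilon$, so the resulting histograms can be compared bin by bin.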

Histogram of bin size $32$ for $\epsilon= 1/256$, over noise

Histogram of bin size $32$ for $\epsilon= 0.039$, over noise

Histogram of bin size $32$ for $\epsilon= 0.5$, over noise

Histogram of bin size $32$ for $\epsilon= 1$, over noise